Wednesday·02·October·2013
How to make wget honour Content-Disposition headers //at 16:12 //by abe
Download links often point to CGI scripts which generate (or
just fetch, i.e. proxy) the actual file to be downloaded, e.g. URLs
like http://www.example.com/download.cgi?file=foobar.txt.
Most such CGI scripts send the real file name in the Content-Disposition
header, as specified in the MIME specification.
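To make this concrete, here is a minimal sketch of what such a header can look like and how the file name could be pulled out of it in the shell. The sample header value and the helper name cd_filename are made up for illustration. The sed expression only covers the simple filename=... form (quoted or unquoted), not the encoded filename*=... variant.

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): print the file name found in a
# Content-Disposition header line. Only the plain filename=... form is
# handled; anything else prints nothing.
cd_filename() {
    printf '%s\n' "$1" |
        sed -n 's/.*filename="\{0,1\}\([^";]*\)"\{0,1\}.*/\1/p'
}

# Example header as a download script might send it (made up for this sketch)
cd_filename 'Content-Disposition: attachment; filename="foobar.txt"'
# prints: foobar.txt
```

That file name is what a browser would offer in its "Save as" dialog.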
All browsers I know (well, at least those I use regularly :-) handle that perfectly: they propose the file name sent in the Content-Disposition header as the name for saving the downloaded file, which is usually exactly what I want.
All browsers do that, …, just not my favourite commandline download tool GNU Wget … Downloading the above URL with wget would look like this with default settings:
$ wget 'http://www.example.com/download.cgi?file=foobar.txt'
--2013-10-02 16:04:16--  http://www.example.com/download.cgi?file=foobar.txt
Resolving www.example.com (www.example.com)... 93.184.216.119, 2606:2800:220:6d:26bf:1447:1097:aa7
Connecting to www.example.com (www.example.com)|2606:2800:220:6d:26bf:1447:1097:aa7|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2020 (2.0K) [text/plain]
Saving to: `download.cgi?file=foobar.txt'

100%[============================================>] 2,020       --.-K/s   in 0s

2013-10-02 16:04:24 (12.5 MB/s) - `download.cgi?file=foobar.txt' saved [2020/2020]
Meh!
But luckily Wget can do that; it’s just not enabled by default because, at least according to the man page, it’s an experimental and possibly buggy feature. Well, works for me! :-)
You can easily enable it by default for either your user or the whole
system by placing the following line in your ~/.wgetrc or /etc/wgetrc:
content-disposition = on
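If you would rather script that step, something along these lines should work. It is only a sketch: the function name enable_content_disposition is made up, and it assumes that any existing content-disposition line in the file should be left alone, whatever its value.

```shell
#!/bin/sh
# Hypothetical helper: append the setting to a wgetrc file unless the file
# already contains an uncommented content-disposition line.
enable_content_disposition() {
    rc="${1:-$HOME/.wgetrc}"
    if grep -qiE '^[[:space:]]*content[-_]?disposition[[:space:]]*=' "$rc" 2>/dev/null; then
        echo "content-disposition already set in $rc, leaving it alone"
    else
        echo 'content-disposition = on' >> "$rc"
        echo "added content-disposition = on to $rc"
    fi
}
```

Calling enable_content_disposition without an argument targets ~/.wgetrc. Pass /etc/wgetrc (as root) for the system-wide file. Running it twice does not add the line a second time.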
Given the CGI script sends an appropriate Content-Disposition header, the above output now looks like this:
$ wget 'http://www.example.com/download.cgi?file=foobar.txt'
--2013-10-02 16:04:16--  http://www.example.com/download.cgi?file=foobar.txt
Resolving www.example.com (www.example.com)... 93.184.216.119, 2606:2800:220:6d:26bf:1447:1097:aa7
Connecting to www.example.com (www.example.com)|2606:2800:220:6d:26bf:1447:1097:aa7|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2020 (2.0K) [text/plain]
Saving to: `foobar.txt'

100%[============================================>] 2,020       --.-K/s   in 0s

2013-10-02 16:04:24 (12.5 MB/s) - `foobar.txt' saved [2020/2020]
Now Wget does what I mean!
You can also set this as a flag on the commandline, but typing
wget --content-disposition …
every time is surely not what I want. ;-)
Tagged as: CGI, CLI, Content-Disposition, download, howto, HTTP, Shell, UUUCO, wget